19 research outputs found

    ICFHR2016 Handwritten Keyword Spotting Competition (H-KWS 2016)

    The H-KWS 2016, organized in the context of the ICFHR 2016 conference, aims at setting up an evaluation framework for benchmarking handwritten keyword spotting (KWS), examining both the Query by Example (QbE) and the Query by String (QbS) approaches. Each KWS approach was hosted in its own track, which in turn was split into two distinct challenges, a segmentation-based and a segmentation-free one, to accommodate the different perspectives adopted by researchers in the KWS field. In addition, the competition aims to evaluate the submitted training-based methods under different amounts of training data. Four participants submitted at least one solution to one of the challenges, according to the capabilities and/or restrictions of their systems. The data used in the competition consisted of historical German and English documents with their own characteristics and complexities. This paper presents the details of the competition, including the data, evaluation metrics and the results of the best run of each participating method.

    This work was partially supported by the Spanish MEC under FPU grant FPU13/06281, by the Generalitat Valenciana under the Prometeo/2009/014 project grant ALMA-MATER, and through the EU projects HIMANIS (JPICH programme, Spanish grant ref. PCIN-2015-068) and READ (Horizon 2020 programme, grant ref. 674943).

    Pratikakis, I.; Zagoris, K.; Gatos, B.; Puigcerver, J.; Toselli, A. H.; Vidal, E. (2016). ICFHR2016 Handwritten Keyword Spotting Competition (H-KWS 2016). IEEE. https://doi.org/10.1109/ICFHR.2016.0117
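
    The abstract mentions evaluation metrics without naming them; benchmarks of this kind are commonly scored with mean average precision over ranked retrieval lists. A minimal sketch, assuming a MAP-style metric; the relevance data below is made up for illustration:

```python
# Minimal sketch: mean average precision (MAP) over ranked retrieval lists,
# the kind of metric commonly used to score keyword-spotting benchmarks.
# Relevance judgements below are illustrative, not competition data.

def average_precision(ranked_hits):
    """ranked_hits: booleans, True where the ranked item is relevant.
    Assumes every relevant item appears somewhere in the ranked list."""
    hits, ap = 0, 0.0
    for rank, relevant in enumerate(ranked_hits, start=1):
        if relevant:
            hits += 1
            ap += hits / rank          # precision at this recall point
    return ap / hits if hits else 0.0

def mean_average_precision(results_per_query):
    return sum(average_precision(r) for r in results_per_query) / len(results_per_query)

# Two hypothetical queries: each list is the relevance of the ranked results.
queries = [[True, False, True, False], [False, True, True]]
print(mean_average_precision(queries))  # 0.7083...
```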

    Transforming scholarship in the archives through handwritten text recognition: Transkribus as a case study

    Purpose: An overview of the current use of handwritten text recognition (HTR) on archival manuscript material, as provided by the EU H2020-funded Transkribus platform. It explains HTR, demonstrates Transkribus, gives examples of use cases, highlights the effect HTR may have on scholarship, and evidences this turning point in the advanced use of digitised heritage content. The paper aims to discuss these issues.
    Design/methodology/approach: This paper adopts a case study approach, using the development and delivery of the one openly available HTR platform for manuscript material.
    Findings: Transkribus has demonstrated that HTR is now a usable technology that can be employed in conjunction with mass digitisation to generate accurate transcripts of archival material. Use cases are demonstrated, and a cooperative model is suggested as a way to ensure sustainability and scaling of the platform. However, funding and resourcing issues are identified.
    Research limitations/implications: The paper presents results from projects; further user studies could be undertaken involving interviews, surveys, etc.
    Practical implications: Only HTR provided via Transkribus is covered; however, this is the only publicly available platform for HTR on individual collections of historical documents at the time of writing, and it represents the current state of the art in this field.
    Social implications: The increased access to information contained within historical texts has the potential to be transformational for both institutions and individuals.
    Originality/value: This is the first published overview of how HTR is used by a wide archival studies community, reporting and showcasing the current application of handwriting technology in the cultural heritage sector.

    Word Spotting as a Service: An Unsupervised and Segmentation-Free Framework for Handwritten Documents

    Word spotting strategies employed in historical handwritten documents face many challenges due to variation in writing style and intense degradation. In this paper, a new method that permits efficient and effective word spotting in handwritten documents is presented. It relies upon document-oriented local features that capture information around representative keypoints, and a matching process that incorporates spatial context in a local proximity search, without using any training data. The method relies on document-oriented keypoint and feature extraction, along with a fast feature matching method. This enables the corresponding methodological pipeline to be employed both effectively and efficiently in the cloud, so that word spotting can be realised as a service on modern mobile devices. The effectiveness and efficiency of the proposed method, in terms of its matching accuracy and its fast retrieval time respectively, are shown through a consistent evaluation of several historical handwritten datasets.
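
    As a rough illustration of the training-free pipeline described above (keypoint detection, local feature extraction, proximity-constrained matching), here is a minimal sketch using generic SIFT features from OpenCV as a stand-in for the paper's document-oriented features; the file names and ratio threshold are assumptions for illustration:

```python
# Illustrative sketch of training-free, segmentation-free word spotting via
# local keypoint matching. Generic SIFT features stand in for the paper's
# document-oriented features; only the pipeline shape is being shown.
import cv2
import numpy as np

def spot_word(page_gray, query_gray, ratio=0.75):
    sift = cv2.SIFT_create()
    kp_q, des_q = sift.detectAndCompute(query_gray, None)
    kp_p, des_p = sift.detectAndCompute(page_gray, None)
    if des_q is None or des_p is None:
        return np.empty((0, 2))
    # Two-nearest-neighbour matching with Lowe's ratio test.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pairs = matcher.knnMatch(des_q, des_p, k=2)
    good = [p[0] for p in pairs
            if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    # Candidate word locations: page keypoints retained by the ratio test.
    # A real system would add a spatial-context check so that matched
    # keypoints must cluster the way the keypoints do in the query word.
    return np.array([kp_p[m.trainIdx].pt for m in good])

page = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)        # hypothetical files
query = cv2.imread("query_word.png", cv2.IMREAD_GRAYSCALE)
print(spot_word(page, query))
```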

    Content-based document retrieval and MPEG-7 metadata

    In recent years the volume of multimedia data has grown rapidly because of the ease with which it can be created. One of the main components of multimedia data is digital images: gigabytes of images are produced daily, generating enormous amounts of information. Exploiting all this information effectively requires intelligent techniques and new technology. To this end, the storage of multimedia information must be organized in a way that allows efficient browsing, searching and retrieval. This dissertation presents five techniques that improve content-based image retrieval systems.

    The first technique reduces the colors of an image by statistical clustering, combining the Kohonen Self-Organized Feature Map (KSOFM) neural classifier with the Gustafson-Kessel (GK) fuzzy classifier. Initially the colors are reduced with the KSOFM, and the color classes it produces initialize the GK fuzzy algorithm; the final GK output defines the color palette of the resulting image. The proposed technique preserves the dominant colors of an image even when very few are kept, and it merges regions of similar color, so it can also be regarded as a strong color image segmentation technique.

    The second proposed method deals with relevance feedback, based on four MPEG-7-like descriptors. Users often do not know exactly which image they are looking for, only a general idea, so the system must offer a way to interact with it. The user is first shown the initial set of retrieval results and can then select those of interest. The retrieval system uses this information to improve the initial results: the image descriptor vector is transformed into another vector, based on its internal characteristics, in which the information supplied by the user is accumulated; its initial values are those of the query image's descriptor. When the user selects an image from the initial results, that image's descriptor updates the values of the transformed vector, and the new retrieval results are produced by treating the descriptor stored in the transformed vector as the query. The proposed technique improves the initial retrieval results at a small computational cost.

    The third technique deals with text localization in document images. A method for locating uniform text is proposed that relies on connected components to detect the objects inside the image, on document structure elements (DSEs) to build the object descriptors, and on Support Vector Machines to select the objects considered to be text. From every such block a descriptor is extracted, constructed from a set of document structure elements. The length of the descriptor can also be reduced from the 510 initial DSEs to any number, using an algorithm called Feature Standard Deviation Analysis of Structure Elements (FSDASE). Finally, the SVM uses the descriptors to classify each block as text or not, and the text blocks are extracted from the original image or located on it. The proposed technique can adapt to the peculiarities of each document image database, since the features adjust to it, and it allows text localization speed to be increased or decreased by manipulating the length of the block descriptor.

    The fourth technique addresses the document retrieval problem with a word matching procedure that works directly on the document images, bypassing OCR and using word images as queries. The system consists of an Offline and an Online procedure. The Offline procedure, which is transparent to the user, analyses the document images and stores the results in a database. It comprises three main stages. First, the document images pass through a preprocessing stage consisting of a median filter, to cope with noise (e.g. in historical or badly maintained documents), and the Otsu binarization method. The word segmentation stage follows, whose primary goal is to detect the word boundaries; this is accomplished with the Connected Components Labeling and Filtering method. A set of features capable of capturing the word shape while discarding detailed differences due to noise or font variation is then used for word matching: Width to Height Ratio, Word Area Density, Center of Gravity, Vertical Projection, Top-Bottom Shape Projections, Upper Grid Features and Down Grid Features. Together these features form a 93-dimensional vector, the word descriptor, which is stored in a database. In the Online procedure, the user enters a query word and the system renders it as an image with a font height equal to the average height of all the word boxes obtained during the Offline operation. The system then computes the descriptor of this query image and, using the Minkowski L1 distance, presents the documents containing the words whose descriptors are closest to the query descriptor. Experimental results show that the proposed system performs better than a commercial OCR package.

    The last method involves an MPEG-like compact shape descriptor that combines conventional contour and region shape features, with wide applicability from arbitrary shapes to document retrieval through word spotting. It is called the Compact Shape Portrayal Descriptor, and its computation is easily parallelized since each feature can be calculated separately. The features are the Width to Height Ratio, Vertical and Horizontal Projections, and Top-Bottom Shape Projections, which together form a 41-dimensional descriptor.
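
    To make the fourth technique's matching step concrete, here is a simplified sketch of a word-shape descriptor plus Minkowski L1 retrieval. It computes only a subset of the listed features (width-to-height ratio, area density, center of gravity, vertical projection) rather than the full 93-dimensional descriptor, and the bin count is an assumption:

```python
# Simplified sketch of the word-matching idea: a small shape descriptor per
# binarized word image, then Minkowski L1 retrieval. Only a subset of the
# listed features is computed; the thesis descriptor is 93-dimensional.
import numpy as np

def word_descriptor(word, bins=16):
    """word: 2-D binary array, 1 = ink. Assumes at least one ink pixel."""
    h, w = word.shape
    ratio = w / h                                  # width-to-height ratio
    density = word.sum() / (h * w)                 # word area density
    ys, xs = np.nonzero(word)
    cog = (xs.mean() / w, ys.mean() / h)           # normalized center of gravity
    # Vertical projection, resampled to a fixed number of bins.
    proj = word.sum(axis=0).astype(float)
    proj = np.interp(np.linspace(0, w - 1, bins), np.arange(w), proj) / h
    return np.concatenate(([ratio, density, *cog], proj))

def retrieve(query_desc, database, k=5):
    """Rank stored (name, descriptor) pairs by Minkowski L1 distance."""
    dists = [(np.abs(query_desc - d).sum(), name) for name, d in database]
    return sorted(dists)[:k]
```

    Because every feature is normalized by the word-box geometry, descriptors of the rendered query image and of segmented word images remain comparable despite differing absolute sizes, which is what lets the Online query bypass OCR entirely.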

    Text localization using standard deviation analysis of structure elements and support vector machines

    A text localization technique is required to successfully exploit document images such as technical articles and letters. The proposed method detects and extracts text areas from document images. Initially, a connected components analysis technique detects blocks of foreground objects. Then, a descriptor consisting of a set of suitable document structure elements is extracted from each block. This is achieved by incorporating an algorithm called Standard Deviation Analysis of Structure Elements (SDASE), which maximizes the separability between the blocks. Another feature of the SDASE is that its length adapts according to the requirements of the application. Finally, the descriptor of each block is used as input to a trained support vector machine that classifies the block as text or not. The proposed technique is also capable of adjusting to the text structure of the documents. Experimental results on benchmarking databases demonstrate the effectiveness of the proposed method.
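
    A minimal sketch of the final classification stage described above, with randomly generated placeholder descriptors standing in for the SDASE structure-element features; the feature dimensionality and kernel choice are assumptions:

```python
# Generic sketch of the classification stage: per-block descriptors fed to a
# trained SVM that labels each block as text or non-text. The random vectors
# below stand in for the paper's SDASE structure-element descriptors.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 64))       # placeholder block descriptors
y_train = rng.integers(0, 2, size=200)     # 1 = text block, 0 = non-text

clf = SVC(kernel="rbf").fit(X_train, y_train)

blocks = rng.normal(size=(10, 64))         # descriptors from a new page
is_text = clf.predict(blocks)
print(is_text)                             # keep only blocks flagged as text
```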
